Abstract. We prove for the rescaled convolution map $f \mapsto f \circledast f$ propagation of
polynomial, exponential and gaussian localization. The gaussian localization is then
used to prove an optimal bound on the rate of entropy production by this map. As
an application we prove the convergence of the CLT to be at the optimal rate $1/\sqrt{n}$
in the entropy (and $L^1$) sense, for distributions with finite 4th moment.



Section 1. - Introduction, Notation, Preliminaries. 

The Central Limit Theorem (CLT) naturally leads to the analysis of the (nonlinear)
rescaled convolution map of a probability density with itself. Related maps
appear in the study of Boltzmann type equations. A major issue is the convergence,
and its rate, in various norms for the CLT. In this work, we will study the convergence in
the strong norm $L^1$, and the stronger sense of convergence in relative entropy.

To find the rate, we use monotonicity or entropy production estimates for the
convolution map. Convergence in this sense was first established by Barron [Bar]. The
corresponding result for the Boltzmann equation was established by Carlen, Carvalho
and Wennberg [CCW]. Such estimates have also allowed, via the method of
[CS], to prove the CLT for dependent variables, in a nonperturbative way.

Our main tool is an optimal entropy production rate for the convolution map.
Such an estimate depends critically on propagation of localization: to successfully
apply the entropy production bound, one needs to show that the localization at
infinity is not spoiled under iteration of the convolution map. We prove in Sections 2
and 3 that polynomial, exponential and, most importantly, gaussian localization are
uniformly propagated by the convolution map. These results are then used to derive
the optimal entropy production bounds in the gaussian case. As an application this
gives the optimal $1/\sqrt{n}$ convergence of the CLT in the entropy and $L^1$ norms, for
gaussian (or better) localization, as well as in the case of bounded moments to order 4.
Propagations of localization are important for other applications. For example,
gaussian propagation of localization for the Boltzmann kernel would have major
implications to asymptotic stability and more [CC1,2, CGT, CELMR, Des, De94, GTN].

We conclude with some mention of possible applications. Our proof of the
propagation of localization in the polynomial and exponential cases is rather direct. In
the polynomial case it follows from moment estimates and in the exponential case
by direct estimates on the generating function.

The Gaussian case is however much more difficult. It is based on a kind of
asymptotic log concavity in the CLT, combined with a theorem of Brascamp and
Lieb, and other analytic arguments. The estimates of entropy production use
linear approximation theory of the map, combined with gaussian propagation of
localization, to arrive at the leading entropy growth term. The propagation of
gaussian localization, which is crucial for getting the optimal convergence rate for
the CLT, is based on upper and lower bounds on the distribution $p$. Hence, if the
distribution has a thin tail, it results in delocalization of the entropy, which breaks
the needed estimates. This problem is usually overcome by assuming, on top of the
localization, a spectral gap assumption [BaBN, Bart, Jon, Vil].

We use a new construction to overcome this problem, thus avoiding the
assumption of a spectral gap, and extending the optimal convergence rates to arbitrary
gaussian localized distributions with finite Fisher information.

As we shall show, if a density $p$ has most of its mass localized in the sense
of having sufficiently many moments bounded, and if we are given a bound on
the Fisher information of $p$, then the tails of $p$ do not contribute significantly to
the entropy of $p$, nor to the entropy production by rescaled convolution of $p$.
Without the bound on the Fisher information, this would not be the case at all.
But since bounds on Fisher information are preserved under iterated rescaled convolution, this
opens the way to the following strategy for dealing with possibly thin tails: We
approximate $p_n$ by a new distribution, $\tilde p_n$, which is obtained by stitching a gaussian
tail to $p_n$ for $|x| > c\sqrt{n}$, and renormalizing the mean and variance. Then, we show
that the monotonicity estimates are optimal for the stitched distribution, and the
difference to $p_n$ is exponentially small. The effect of the small errors is absorbed by
the monotonicity (entropy production) bounds, similar to the way perturbations of
the convolution map were treated in our paper [CS].

Our notation and preliminaries follow closely the paper [CS]. Here we briefly
recall the main ingredients of entropy/information bounds [CS, Dem, Lie78, Lie89, Bar].

Let $X$ be an $\mathbb{R}^m$ valued random variable on some probability space. Let $\mu$ denote
the law of $X$. If $d\mu(x) = p(x)dx$, we say that $X$ has density $p(x)$. $m(X)$ stands for
the mean of $X$, and $m_j(X)$ for the $j$-th moment of $X$. The variance is then

$$[\sigma(X)]^2 = E\left(|X - m(X)|^2\right)$$

and $X$ has variance 1 if $[\sigma(X)]^2$ is the identity matrix. Let $g_t$ denote the centered
Gaussian density with variance $t$:

$$g_t(x) = (2\pi t)^{-m/2} e^{-|x|^2/2t}, \qquad g = g_1.$$

The entropy of $p$ is

$$S(p) = -\int p \ln p\,dx$$

and the relative entropy of $p$ is

$$D(p) = \int \frac{p(x)}{g(x)} \ln\left(\frac{p(x)}{g(x)}\right) g(x)\,dx.$$

By Jensen's inequality $D(p) \ge 0$ with equality just when $p = g$. Clearly, if $p$ has
mean zero and unit variance

$$-\infty \le S(p) \le S(g)$$

and the upper bound is saturated only when $p = g$.

Moreover, for $p$ with mean zero and unit variance, which we will refer to as $p$
being normalized,

$$D(p) = S(g) - S(p).$$

For a centered density $p$ with $\sigma^2(p) = \mathrm{Tr}[\sigma(X)^2] = m$ ($m$ the dimension) and $\sqrt{p} \in H^1(\mathbb{R}^m)$,
the Sobolev space, we define the Fisher information

$$I(p) = 4\int |\nabla\sqrt{p}|^2\,dx$$

and the relative Fisher information, $J(p)$, as

$$J(p) = 4\int \left|\left(\nabla + \frac{x}{2}\right)\sqrt{p}\right|^2 dx.$$

Clearly, $J(p) \ge 0$ with $J(p) = 0 \iff p = g$.
Also, note that, when $J(p) < \infty$,

$$J(p) = \int |\nabla\ln p(x) - \nabla\ln g(x)|^2 p(x)\,dx.$$



The origin of the convolution map is the following: Suppose $X_1, X_2$ are two independent
random variables with densities $p_1, p_2$. For $0 < \lambda < 1$ denote the density
of $\lambda X_1 + (1 - \lambda^2)^{1/2} X_2$ by $p_1 *_\lambda p_2$. One computes

$$p_1 *_\lambda p_2(u) = \int p_1(\lambda u - (1-\lambda^2)^{1/2} v)\, p_2((1-\lambda^2)^{1/2} u + \lambda v)\,dv.$$
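The formula for the density of $\lambda X_1 + (1-\lambda^2)^{1/2}X_2$ can be sanity-checked numerically in one dimension. The sketch below is only an illustration under stated assumptions: both densities are taken to be the standard Gaussian $g$, for which the rotation invariance of $g(x_1)g(x_2)$ forces the result to be $g$ again. The function names and the quadrature parameters are hypothetical choices, not taken from the paper.

```python
import math

def g(x):
    """Standard Gaussian density in one dimension."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def lam_convolution(p1, p2, lam, u, half_width=12.0, steps=4000):
    """Trapezoid approximation of the integral over v of
    p1(lam*u - s*v) * p2(s*u + lam*v), with s = sqrt(1 - lam^2)."""
    s = math.sqrt(1.0 - lam * lam)
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        v = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * p1(lam * u - s * v) * p2(s * u + lam * v)
    return total * h

# For p1 = p2 = g the result should reproduce g(u) for every lambda.
max_err = max(
    abs(lam_convolution(g, g, lam, u) - g(u))
    for lam in (0.3, 0.6, 0.9)
    for u in (-2.0, 0.0, 0.5, 1.7)
)
print("largest deviation from g:", max_err)
```

The choice $\lambda = 1/\sqrt{2}$ corresponds to the rescaled convolution used in the rest of the paper.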

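Let $f$ be a bounded measurable function on $\mathbb{R}^m$. Define the operator $P_t$, $t \ge 0$, by

$$P_t f(x) = E f(e^{-t}x + (1 - e^{-2t})^{1/2} G)$$

where $G$ is a standard Gaussian random variable. Then $P_t$ is a contraction semigroup on each $L^p(\mathbb{R}^m, g(x)dx)$, $1 \le p \le \infty$. $P_t^*$ denotes
the adjoint in $L^2(\mathbb{R}^m, dx)$. In particular, if $X$ is a random variable with density $p$,

$P_t^* p(x)$ is the density of $e^{-t}X + (1 - e^{-2t})^{1/2} G$.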

We have the following relation between entropy and information, which is contained
in [CS].

Lemma. Suppose $p$ is a centered density with $\sigma^2(p) = m$. Then $t \mapsto S(P_t^* p)$ is
continuous and monotone increasing on $[0, \infty)$ with

$$\lim_{t\to\infty} S(P_t^* p) = S(g).$$

Furthermore, when $S(p) > -\infty$, $t \mapsto S(P_t^* p)$ is continuously differentiable on $(0, \infty)$
and

$$\frac{d}{dt} S(P_t^* p) = J(P_t^* p)$$

and

$$S(P_t^* p) = S(p) + \int_0^t J(P_s^* p)\,ds, \qquad D(p) = \int_0^\infty J(P_t^* p)\,dt.$$
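As an illustration of the identity $D(p) = \int_0^\infty J(P_t^* p)\,dt$, one can test it on a one-dimensional centered Gaussian of variance $s$, where everything is explicit. This is a sketch under assumptions that are not in the paper: the test density is deliberately not normalized to unit variance, $P_t^* g_s$ is the Gaussian of variance $1 + (s-1)e^{-2t}$, and the closed forms $D(g_s) = (s - 1 - \ln s)/2$ and $J(g_v) = (v-1)^2/v$ follow from the definitions of $D$ and $J$ above. The function names are hypothetical.

```python
import math

def relative_entropy_gaussian(s):
    """D(g_s) relative to the standard Gaussian, one dimension."""
    return 0.5 * (s - 1.0 - math.log(s))

def relative_fisher_gaussian(v):
    """J(g_v) = integral of (x - x/v)^2 g_v(x) dx = (v - 1)^2 / v."""
    return (v - 1.0) ** 2 / v

def integrated_information(s, t_max=20.0, steps=40000):
    """Trapezoid approximation of the time integral of J(P_t^* g_s).
    Along the Ornstein-Uhlenbeck flow the variance is 1 + (s-1) e^{-2t}."""
    h = t_max / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        variance = 1.0 + (s - 1.0) * math.exp(-2.0 * t)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * relative_fisher_gaussian(variance)
    return total * h

for s in (0.25, 2.0, 5.0):
    print(s, integrated_information(s), relative_entropy_gaussian(s))
```

The two printed columns should agree up to quadrature error, which is the content of the Lemma in this special case.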



We will also use the inequality

$$D(X) \le \frac{1}{2} J(X)$$

due to Stam [Sta] which is equivalent to Gross's logarithmic Sobolev inequality
[Gro], [Ca].

The proof follows from

$$D(X) = \int_0^\infty J(e^{-t}X + (1 - e^{-2t})^{1/2} G)\,dt \le \int_0^\infty e^{-2t} J(X)\,dt$$

using the Blachman-Stam inequality:

$$J(e^{-t}X + (1 - e^{-2t})^{1/2} Y) \le e^{-2t} J(X) + (1 - e^{-2t}) J(Y).$$

We also have the Kullback-Leibler inequality

$$\|p - g\|^2_{L^1(\mathbb{R}^m, dx)} \le 2 D(p).$$
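The two inequalities just recalled, Stam's bound of $D$ by half of $J$ and the Kullback-Leibler bound of the squared $L^1$ distance by $2D$, can be checked on a simple family. The sketch below assumes one-dimensional centered Gaussians of variance `var` as test densities, for which $D = (\mathrm{var} - 1 - \ln \mathrm{var})/2$ and $J = (\mathrm{var} - 1)^2/\mathrm{var}$. The $L^1$ distance is computed by a plain trapezoid rule, and all names are illustrative.

```python
import math

def gauss(x, var):
    """Centered Gaussian density with variance var."""
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def l1_distance_to_standard(var, half_width=40.0, steps=80000):
    """Trapezoid approximation of the integral of |g_var - g_1|."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * abs(gauss(x, var) - gauss(x, 1.0))
    return total * h

def check_chain(var):
    """Return (||p-g||_1^2, 2 D(p), J(p)) for p = g_var."""
    d = 0.5 * (var - 1.0 - math.log(var))
    j = (var - 1.0) ** 2 / var
    return l1_distance_to_standard(var) ** 2, 2.0 * d, j

for var in (0.5, 2.0, 4.0):
    print(var, check_chain(var))
```

Each printed triple should be increasing from left to right.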

The main inequality we prove for entropy production is that under favorable
assumptions on both smoothness and gaussian localization of $p$,

$$S(p \circledast p) - S(p) \ge C D(p).$$

Our previous work [CS] only gave a lower bound of the form $\Phi_p(J(p))$. The
application of this inequality requires that localization and smoothness be maintained
under repeated iteration. So, for this we prove that gaussian (polynomial and
exponential) localization is uniform in $n$ for

$$p_n = p \circledast p \cdots \circledast p, \quad n \text{ times}.$$

We now state the main theorem with convergence rate:

Theorem (Optimal Entropy convergence). Let $p$ be a regular, normalized
(mean zero, variance 1) distribution with bounded 4th moment:

$$I(p) < \infty, \qquad \|p|x|^4\|_1 \le c < \infty.$$

Then

(5.2) $|D(p_n)| \le c/N$,

where $N := 2^n$. In particular, the CLT holds in the entropy (and $L^1$) sense with the
optimal convergence rate $1/\sqrt{n}$.
Section 2. Propagation of Localization I - Polynomial and Exponential.

Let $p$ be a normalized distribution, localized exponentially:

(2.1) $p_a(x) = \lambda(a)^{-1} e^{ax} p(x)$

with $p_a(x)$ bounded and in $L^1$; here $\lambda(a)$ is the normalization constant so that

(2.2) $\int p_a(x)\,dx = 1.$
Therefore

(2.3) $\lambda(a) = \int e^{ax} p(x)\,dx.$

Theorem 2.1 (Exponential Localization). Let $p$ be a distribution in $L^1$ and
such that $\lambda(a) < \infty$ for $a < A$, $A > 0$.
Then

(2.4) $L_{p_n}(a) = \int e^{ax} p_n(x)\,dx = \int e^{ax} \sqrt{n}\, p * \cdots * p(\sqrt{n}x)\,dx \le 2e^{a^2/2}$
for all $a < A$.

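Remark. The above identity, of equation (2.4), is due to Cramér [Cr].
Proof. First we compute the convolution

(2.5) $$p_a * p_a(x) = \lambda(a)^{-2} \int e^{a(x-y)} p(x-y)\, e^{ay} p(y)\,dy = \lambda(a)^{-2} e^{ax} \int p(x-y) p(y)\,dy = (p * p)_a(x).$$

Therefore since

$$p_n(x) = \sqrt{n}\, p * p \cdots * p(\sqrt{n}x), \quad n \text{ times},$$

we have

(2.6) $$L_{p_n}(a) = \int e^{ax} \sqrt{n}\, p * \cdots * p(\sqrt{n}x)\,dx = \int e^{ay/\sqrt{n}}\, p * \cdots * p(y)\,dy = \left[L_p(a/\sqrt{n})\right]^n$$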



E. CARLEN AND A. SOFFER 






by (2.5). 

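Next, we expand $L_p(a/\sqrt{n})$ around zero. Since $p$ is normalized, $L_p(0) = 1$, $L_p'(0) = 0$ and $L_p''(0) = 1$, so we get

$$L_p(a/\sqrt{n}) = 1 + \frac{a^2}{2n} + \frac{a^3}{6 n^{3/2}} L_p^{(3)}(b)$$

for some $0 < b < a/\sqrt{n}$, where

$$|L_p^{(3)}(b)| = \left|\int x^3 e^{bx} p(x)\,dx\right| \le C_\epsilon L_p(b + \epsilon)$$

and we always choose $b + \epsilon < A$.
Finally,

$$L_{p_n}(a) = \left(1 + \frac{a^2}{2n} + C_\epsilon n^{-3/2}\right)^n \to e^{a^2/2} \quad \text{as } n \to \infty$$

so $L_{p_n}(a) \le 2e^{a^2/2}$ for all $n$. □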
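The mechanism of this proof, namely that the generating function of $p_n$ is the $n$-th power of the generating function of $p$ evaluated at $a/\sqrt{n}$, is easy to observe numerically. The sketch below assumes a specific test density that does not appear in the paper: the uniform density on $[-\sqrt{3}, \sqrt{3}]$ (mean zero, unit variance), whose generating function is $\sinh(\sqrt{3}a)/(\sqrt{3}a)$. The function names are hypothetical.

```python
import math

def mgf_uniform(a):
    """Generating function of the uniform density on [-sqrt(3), sqrt(3)]."""
    x = math.sqrt(3.0) * a
    return 1.0 if x == 0.0 else math.sinh(x) / x

def mgf_rescaled(a, n):
    """Generating function of p_n, computed as L_p(a / sqrt(n)) ** n."""
    return mgf_uniform(a / math.sqrt(n)) ** n

for a in (0.5, 1.5, 3.0):
    limit = math.exp(0.5 * a * a)
    values = [mgf_rescaled(a, n) for n in (1, 2, 10, 1000)]
    print(a, values, "Gaussian limit:", limit)
```

For this particular density the values increase towards $e^{a^2/2}$ and stay below it, so the bound $2e^{a^2/2}$ holds with room to spare. Other densities may approach the limit from above.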

Theorem 2.2 (Polynomial Localization). Assume for $N_0$ fixed, $N_0 > 2$,

(2.7) $\int |x|^{N_0} p(x)\,dx = M_{N_0}(p) \le d_0 < \infty.$

Let $p_n$ be the normalized $n$-convolution as before.
Then, there exists $d > 0$ such that

(2.8) $M_{N_0}(p_n) \le d(N_0, d_0)$, uniformly in $n$.

Remark. Similar results with weak localization were proved in [CS]; they are
optimal in the conditions of localization, where a Lindeberg type condition is used. The
proof for such weak localization is more involved.

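Proof. Consider first $N_0 = 2k$, $k$ integer. It is enough to consider the case of even
distributions.

So let $k = 2$, with $p$, $\eta$ even:

(2.9) $$I_{4,\theta} = \int x^4\, \eta(x\cos\theta + y\sin\theta)\, p(-x\sin\theta + y\cos\theta)\,dx\,dy = \int (u\cos\theta + v\sin\theta)^4\, \eta(u) p(v)\,du\,dv = \cos^4\theta\, M_4(\eta) + \sin^4\theta\, M_4(p) + 6\cos^2\theta\sin^2\theta$$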



PROPAGATION OF LOCALIZATION AND OPTIMAL ENTROPY

where we used evenness, and the fact that $M_2(p) = M_2(\eta) = \int x^2 p = \int x^2 \eta = 1$.
(Recall that we always assume that $M_2(p) = 1$.)
Completing to squares, we get from (2.9):

(2.10) $$I_{4,\theta} = \left(M_4^{1/2}(\eta)\cos^2\theta + M_4^{1/2}(p)\sin^2\theta\right)^2 + 2\left(3 - M_4^{1/2}(p) M_4^{1/2}(\eta)\right)\cos^2\theta\sin^2\theta.$$

For the gaussian distribution $g$, $M_4(g) = 3$.

Therefore, if $M_4(p), M_4(\eta) \le 3$ the $M_4$ moment increases under convolution to
approach 3.

On the other hand, if both $M_4$ are larger than 3, then
$M_4(p_2)$ decreases, so

(2.11) $M_4(p_2) \le \max\{M_4(p), M_4(\eta)\}.$

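By Jensen's inequality

$$M_4(p) \ge \left(\int x^2 p\,dx\right)^2 \ge 1$$

so that

$$M_4^{1/2}(p) M_4^{1/2}(\eta) \ge \min\{M_4^{1/2}(p), M_4^{1/2}(\eta)\}$$

and hence

$$3 - M_4^{1/2}(p) M_4^{1/2}(\eta) > 0 \quad \text{only if} \quad \max\{M_4^{1/2}(p), M_4^{1/2}(\eta)\} < 3.$$

We conclude that

$$M_4(p_2) \le \max\{M_4(p), M_4(\eta), 9\}, \qquad p_2 = p * \eta.$$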

After iteration, we therefore get

$$M_4(p_n) \le \max\{M_4(p), M_4(\eta), 9\}.$$
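The fourth-moment bookkeeping in (2.9) and (2.10) can be made concrete with a few lines of arithmetic. The sketch below assumes even densities with unit variance, as in the proof, and takes $\theta = \pi/4$ for the rescaled self-convolution. The starting values 9/5 (uniform density on $[-\sqrt{3}, \sqrt{3}]$) and 20 are illustrative choices, and the function names are hypothetical.

```python
import math

def fourth_moment_mix(m4_eta, m4_p, theta):
    """Fourth moment of cos(theta) U + sin(theta) V for independent even
    U, V with unit variance and fourth moments m4_eta, m4_p, as in (2.9)."""
    c2 = math.cos(theta) ** 2
    s2 = math.sin(theta) ** 2
    return c2 * c2 * m4_eta + s2 * s2 * m4_p + 6.0 * c2 * s2

def fourth_moment_mix_squares(m4_eta, m4_p, theta):
    """The same quantity written in the completed-square form (2.10)."""
    c2 = math.cos(theta) ** 2
    s2 = math.sin(theta) ** 2
    a, b = math.sqrt(m4_eta), math.sqrt(m4_p)
    return (a * c2 + b * s2) ** 2 + 2.0 * (3.0 - a * b) * c2 * s2

def iterate_fourth_moment(m4, doublings):
    """Apply the rescaled self-convolution (theta = pi/4) repeatedly."""
    for _ in range(doublings):
        m4 = fourth_moment_mix(m4, m4, math.pi / 4.0)
    return m4

print([iterate_fourth_moment(9.0 / 5.0, k) for k in range(6)])  # rises towards 3
print([iterate_fourth_moment(20.0, k) for k in range(6)])       # falls towards 3
```

With $\theta = \pi/4$ the map reduces to $M_4 \mapsto M_4/2 + 3/2$, so the Gaussian value 3 is an attracting fixed point and the iterates never exceed the maximum of the starting value and 3.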
In the case $k > 2$, arbitrary, we have in a similar way

$$\int x^{2k}\, \eta(x\cos\theta + y\sin\theta)\, p(-x\sin\theta + y\cos\theta)\,dx\,dy = M_{2k}(\eta)\cos^{2k}\theta + M_{2k}(p)\sin^{2k}\theta + R_k$$

where $R_k$ are lower order moments (in powers of $k$). And as before, we estimate
the above equality by

$$\le C_{2k}\cos^2\theta + C_{2k}\sin^2\theta = C_{2k}$$

with

$$C_{2k} = \max\{M_{2k}(p), M_{2k}(\eta), C_{2k-1}\},$$

from which the result follows.

The general case now follows from the following Proposition (2.3). □

Definition. For a random variable $X$, we define the $\psi$-function of $X$ as

$$\psi(R) = E\, 1_{\{|X| > R\}} X^2 = \int_{|x| > R} x^2 p(x)\,dx.$$

$E$ denotes expectation, and $1_{\{A\}}$ is the indicator function of $A$.

Proposition 2.3. Let $\{X_j\}_{j=1}^\infty$ be an i.i.d. sequence of random variables with $p$
finite moments, uniformly in $j$, in the integral sense:

$$\psi_j(R) \le \psi(R)$$

and

$$\int_0^\infty \psi(R) R^{p-3}\,dR \le C_\psi < \infty.$$

Here $\psi_j(R)$ is the $\psi$-function of $X_j$.

Then, for any $\epsilon > 0$, there exists a constant $C$, depending only on $C_\psi$ and $\epsilon$, such
that

(2.12) $E|Z_{2^n}|^{p-\epsilon} \le C(C_\psi, \epsilon)$

where

$$Z_{2^n} = 2^{-n/2} \sum_{j=1}^{2^n} X_j.$$

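Proof. We prove it only for the normalized case where all variances are 1.
Let $2k < p < 2k + 2$ be given. Write

(2.13) $$Z_{2^n} = 2^{-n/2} \sum_j X_j = 2^{-n/2}\Big\{\sum_j U_j + \sum_j V_j\Big\}$$

with

$$U_j = X_j 1_{\{|X_j| < K\}}, \qquad V_j = X_j - U_j.$$

Then, with $Z = Z_{2^n}$,

(2.14) $$\psi_{Z_{2^n}}(R) \le 2E\Big[1_{\{|Z| > R\}}\Big(2^{-n/2}\sum_j U_j\Big)^2\Big] + 2E\Big\{2^{-n/2}\sum_j V_j\Big\}^2.$$

The second term on the r.h.s. of (2.14) is bounded by $2\psi(K)$ and the first term is
controlled by Hölder's inequality:

$$\text{first term} \le 2P(|Z| > R)^{k/(k+1)} \left(M_{2k+2}(U)\right)^{1/(k+1)}, \qquad U = 2^{-n/2}\sum_j U_j,$$

$$M_{2k+2}(U) \le C M_{2k+2}(U_1) \le C' K^{2k+2-p}$$

by the even case, where $C'$ is the $p$-th moment of $U_1$, and

$$P(|Z| > R) \le R^{-2}\psi(R).$$

Combining all this we get

(2.15) $$\psi_{Z_{2^n}}(R) \le C R^{-2k/(k+1)} \psi(R)^{k/(k+1)} K^{(2k+2-p)/(k+1)} + 2\psi(K).$$

Now, choose $K = R$ in (2.15), to get

$$\psi_{Z_{2^n}}(R) \le C R^{(2-p)/(k+1)} \psi(R)^{k/(k+1)} + 2\psi(R).$$

Multiplying by $R^{p-3-\epsilon}$ and using Hölder's inequality again, the result follows. □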
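To see how the integral condition on the $\psi$-function encodes a $p$-th moment, note that exchanging the order of integration gives $\int_0^\infty \psi(R) R^{p-3}\,dR = E|X|^p/(p-2)$ for $p > 2$. The sketch below checks this for the standard Gaussian. It assumes that the weight in the integral condition is $R^{p-3}$, which is how the garbled exponent is read here, and it uses the closed form of $\psi$ for the Gaussian, which is not stated in the paper. All names are hypothetical.

```python
import math

def psi_gaussian(R):
    """psi(R) = integral over |x| > R of x^2 g(x) dx for the standard
    Gaussian, obtained by one integration by parts."""
    return math.erfc(R / math.sqrt(2.0)) + math.sqrt(2.0 / math.pi) * R * math.exp(-0.5 * R * R)

def weighted_psi_integral(p, r_max=12.0, steps=120000):
    """Trapezoid approximation of the integral of psi(R) R^(p-3) over [0, r_max]."""
    h = r_max / steps
    total = 0.0
    for i in range(steps + 1):
        R = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * psi_gaussian(R) * R ** (p - 3)
    return total * h

# Gaussian moments: E X^4 = 3 and E X^6 = 15.
print(weighted_psi_integral(4), 3.0 / 2.0)
print(weighted_psi_integral(6), 15.0 / 4.0)
```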

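Section 3. Propagation of Localization II - Gaussian.

Now we assume that $p$ is a gaussian localized, normalized distribution:

$$|e^{cx^2} p(x)| \le C_0 \quad \text{for some } c > 0, \text{ as } |x| \to \infty.$$

We use $*$ to denote convolution and $\circledast$ (printed as ® in running text) to denote the normalized (rescaled)
convolution, so that $p \circledast p$ is the density of $(X_1 + X_2)/\sqrt{2}$ for independent $X_1, X_2$ with density $p$.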
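Theorem 3.1. Let $p$ be as above and assume furthermore that

(3.1) $p = gF$

and $F$ is logconcave ($\ln F$ is concave).

Then $p_n = \sqrt{n}\, p * \cdots * p(\sqrt{n}x)$ is gaussian localized, uniformly in $n$.

Proof. By Brascamp-Lieb we have that $p \circledast p = gF_2$:

(3.2) $$p \circledast p(x) = \int g\left(\tfrac{x+y}{\sqrt{2}}\right) F\left(\tfrac{x+y}{\sqrt{2}}\right) g\left(\tfrac{x-y}{\sqrt{2}}\right) F\left(\tfrac{x-y}{\sqrt{2}}\right) dy = g(x) \int g(y) F\left(\tfrac{x+y}{\sqrt{2}}\right) F\left(\tfrac{x-y}{\sqrt{2}}\right) dy = g(x) F_2(x),$$

with $F_2$ logconcave.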

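Next, we need the following proposition.

Proposition 3.2 (Brascamp-Lieb).

For $g$ Gaussian,

(3.3) $\int x^{2k} gF\,dx \le \int x^{2k} g\,dx$

when $\int gF\,dx = 1$, and $F$ logconcave.
From this proposition it follows, by expanding the exponential in even powers of $x$, that

(3.4) $\int e^{\epsilon x^2} gF\,dx \le \int e^{\epsilon x^2} g\,dx.$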



Since in our case $p_n = gF_n$, we get

pI = g^Fl = ( j g^Fldx){ J g^F^dx)-'g'F^ = \\Pn\\hg^F 

F logconcave (since F^ is logconcave). 
Hence, 

Je^^'pldx<\\pn\\hje^^'g'dx 

□ 
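The Brascamp-Lieb comparison used in this proof can be sanity-checked on a concrete logconcave factor. The sketch below is an assumption-laden illustration: it reads Proposition 3.2 as a comparison of even moments, takes $F(x)$ proportional to $e^{-|x|}$ (logconcave, chosen here only as an example), and compares the normalized moments of $gF$ with those of $g$ by a trapezoid rule. The function names and the value $\epsilon = 0.2$ are hypothetical.

```python
import math

def tilted_average(observable, half_width=15.0, steps=30000):
    """Average of observable(x) under the density proportional to
    g(x) * exp(-|x|), computed with the trapezoid rule."""
    h = 2.0 * half_width / steps
    num = 0.0
    den = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        density = math.exp(-0.5 * x * x - abs(x))
        num += weight * observable(x) * density
        den += weight * density
    return num / den

# Even moments of the standard Gaussian: E x^2 = 1, E x^4 = 3, E x^6 = 15.
gaussian_even_moments = {1: 1.0, 2: 3.0, 3: 15.0}
for k, gaussian_value in gaussian_even_moments.items():
    print(k, tilted_average(lambda x, k=k: x ** (2 * k)), gaussian_value)

eps = 0.2
lhs = tilted_average(lambda x: math.exp(eps * x * x))
rhs = (1.0 - 2.0 * eps) ** -0.5  # integral of exp(eps x^2) g(x) dx for eps < 1/2
print(lhs, rhs)
```

In each printed pair the first number (for $gF$) should not exceed the second (for $g$).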

Remark. If p is regularized as p ^ pt = p ® gt we have 



< 00. 



(3.6) / e^-Vt,n = / Pn ® ^te^"' = / pngt * e^"' = / Pne^*^' 



with Pt ~ /5. 

It remains to show that a sufficiently smooth gaussian localized $p$ will have the
form $gF$ after sufficiently many iterations.
Next, we demonstrate such cases:
Theorem 3.3. Let

(3.7) $p = (2\pi)^{-1/2} \exp(-x^2/2) + \tilde p(x)$

and assume that 



I 



(3.8) I / e"^p(a;)da;| < Ciel"l ,£ > 0, 

and p smooth. 

Then, for n sufficiently large, pn = gF with F logconcave. 
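Proof. Let, as before,

$$p_a = \lambda(a)^{-1} e^{ax} p(x), \qquad \lambda(a) = \int e^{ax} p(x)\,dx.$$

We have a lower bound on $\lambda(a)$: by (3.8) it follows that

(3.9) $\lambda(a) \ge \frac{1}{2}\left(e^{a^2/2} - C_1\right)$ for $a > a_0(C_1, \epsilon)$

where $a_0$ is approximately $(\ln C_1)^\beta$, some $\beta > 0$.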
Now, 

X{a)-^ J{x- m„)'^e""p(x)(ix = X{a)-^ J {x - ma)^(27r)-^/ V"'^'e""(ia; 
+ A(a)-i lix- m«) V"p(a;)(ia; 

= Il+l2 

(3.11) 

Ii = {2n)~^^'^ J ^^"^ ~ "^^^ ^^"^ ~ ci;p(mQ, — ck)^ + {ma — ck)^ + odd terms }A(q;)' 
X e~^^'^~°'^^ dx 

< {3 + 6(ma - af + (m« - a)^ + 0}2e"'/V(e"'/^ - C'l). 
Therefore $I_1$ remains bounded uniformly in $a$, if $|m_a - a| \le C_0$ uniformly in $a$.
Furthermore, $I_2$ is small when $a$ is large, by our assumptions on $p(x)$.
Now, 

nia = X{a)~^ J xe°'''p{x)dx = a + X{a)~^ J xe°'^p{x)dx 

= a + 0(a"^) 

which implies that the r.h.s of (3.11) is uniformly bounded in a. To conclude, 3.10 - 
3.11 implies that the fourth moment is uniformly bounded; and the second moment 
is close to 1. 
Next, 

-^Pa = Pa = nice + X{a)~'^e'^''{ap + p'{x)) 
where nice stands for terms which are uniformly bounded in a, so, 
(3.12) Wp'Jh < II nice + A(a)-2||e°'(Qp + p'(x))|||,. 

/|e«(„p.p')r = /e-(p'f^-/e-W^ 
So, to prove a bound on (3.12) that is uniform in $a$, we only need to bound

$$\lambda(a)^{-2} \int e^{2ax} (p')^2\,dx \le C, \quad \text{uniformly in } a,$$

which is implied by our conditions on $p$.

Now, taking the $n$-th normalized convolution of $p_a$, $p_a^{(n)}$, we know by the polynomial
propagation of localization, Thm 2.2, and by the entropy production bounds
of [CS] that



We use that convolution improves or preserves the smoothness of $p$; therefore we can
take $\Phi_p$ to be independent of $p_a$, see [CS]. The function $\Phi_p$ was obtained through a
compactness argument, and was not computable. On the other hand, we were able
to show that $\Phi_p(t)$ was strictly increasing as a function of $t$, and hence $\Phi_p(t) > 0$ for $t > 0$.
Moreover $\Phi_p(t)$ depended on $p$ only in a way that was invariant under the
convolution map, so that the same function $\Phi$ could be used at each stage in the
iterated convolution. This fact was crucial in our application, which requires us to
absorb the effect of dependence.

In this paper we will estimate $\Phi_p$. We will place more restrictive conditions on
$p$, but shall obtain quantitative information on $\Phi_p$ in return.

Hence, $p_a^{(n)}$ converges to a gaussian in entropy, $S$, and so in $L^1$. By smoothness,
all derivatives also converge, uniformly in $a$.
Now, it follows that for $n > n_0$,

-anpi-))"U=o>l-£ 
and since, moreover, $a \mapsto m_a$ covers $\mathbb{R}$, we have that

-(Znp("+^))" >l-£forall x. 

Hence, 

p(-+i) = p,+i = e-(i--)-V2^ 

with F logconcave. □ 

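Remark. If $p$ is not smooth, then we apply the theorems to $\tilde p = M \circledast p$ with $M$
gaussian. For such $\tilde p$ the condition on $\tilde p'$ is satisfied whenever we have the bound
3.8, since

$$\tilde p' = M' \circledast p.$$

Furthermore, the gaussian localization of $(M \circledast p)_n$ implies that of $p_n$, since

$$(M \circledast p) \circledast (M \circledast p) = M \circledast (p \circledast p)$$

so, since $M$ is well localized, $p \circledast p \circledast \cdots p$ is well localized whenever
$(M \circledast p) \circledast (M \circledast p) \cdots (M \circledast p)$ is well localized.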

Section 4. Entropy Production.

In this section, we prove optimal entropy production bounds for the convolution
map.

Recall the following formula for the entropy production by convolution [CS]:

(4.1) $$S(p \circledast p) - S(p) = \int_0^\infty \left[J(p_t) - J(p_t \circledast p_t)\right] dt$$

where $S$ is the entropy and $J$ is the relative information.

Here $p_t$ is the image of $p$ at time $t$ under the Ornstein-Uhlenbeck process.
Also from [CS, Bar] we have the following bound

(4.2) $|\nabla\sqrt{p_t}|^2 \le B_t P_t^* p(x)$

which, by way of the localization of $p$, implies that $|\nabla\sqrt{p_t}|$ is similarly localized.
Also, recall the definition of the $\psi$ function

$$\psi(R) = \int_{|x| > R} x^2 p(x)\,dx.$$

Define

$$J_R(p) = 4\int_{|x| > R} \left|\left(\nabla + \frac{x}{2}\right)\sqrt{p}\right|^2 dx.$$
Lemma 4.1.

(4.3) $J_R(p) \le 2\psi(R) + 8(1 + R^2)^{-1} B_t P_t^* \psi(R).$

Proof. Follows from (4.2) and the definition of $\psi(R)$.
Lemma 4.2.

(4.4) $P_t^* \psi(R) \le \psi_p(R/2) + \psi_g(R/2)$
Proof. See [CS].

We can now state the main entropy production bound. (See [CC1], [CS] for similar
results with weaker, nonlinear lower bounds in $D(p)$, in the case of the Boltzmann
equation and the CLT, respectively. However, those results do hold for general $p$;
i.e. finite variance and finite entropy are the only conditions imposed.)

Theorem 4.3. Let $p$ satisfy $J(p), S(p)$ finite, $p$ smooth, and have a finite second
moment.

(1) Suppose that $K \ge g/p \ge 1/K$ for some constant $K$. Then
(4.6) S{p®p)>^D{p). 

(2) More generally, define $R_\epsilon$ so that

2ijp{R,) + 8(1 + i?2)-i + iPp{Rj2) + iPg{Re/2) < J{p)/2 := e . 
Suppose that $g/p$ is bounded below on the ball of radius $R_\epsilon$. Then
(4.6b) S{p®p)>CeD{p). 

where $C_\epsilon$ depends only on $\epsilon$ and $\psi_p$.

Remark. The constant $C_\epsilon$ depends on the localization of the relative Fisher
information, and the distance of the distribution $p$ from the normalized Gaussian.
Therefore, an estimate with a known, uniformly bounded constant would require
controlling such quantities. This follows when we have propagation of Gaussian
localization, as in Section 3. Alternatively, one may expect to prove propagation of
localization for the relative Fisher information, which we do not have. In Section
5, we use a new construction (stitching) to obtain uniform bounds for $C_\epsilon$.

Proof. 

If $p = g$ there is nothing to prove.
For $p \ne g$, $J(p) > 0$. So assume $J(p) = \epsilon$. We now choose $R$ so large that

$$J_R(p) \le \frac{1}{2} J(p).$$

$R(\epsilon)$ is fixed by

(4.7) $$2\psi(R) + C_t \tilde\psi(R)/(1 + R^2) \le \epsilon/2$$

with $\tilde\psi = P_t^* \psi$.

Next, we use the lower bound, proposition (4.4) below: 



(4.8) 



J{pt) - J{pt ® Pt) > Fa.r 



M{E 

c.d 



d T 

— In pt{-G) + cG + d 
ax a 



= inf / \V In pt(x) + cx + d\^g(x)dx 

c,d J 

> / \V In pt(x) — c*x — d*\^g{x)dx 

J\x\<R{s) 

for some c*,d*. 

This last expression is then equal to 



-L 



a;|<fl(e) 



Pt{x) 



pt{x)dx 



Pt{x) 



with Q = Vlnpt{x) - C*x - d* , 



and we also have 
(4.9) 



\V In pt{x) - c*x - d*\'^p{x)dx < e/2. 



\x\>R(e) 

Finally, (4.8) and (4.9) imply 



(4.10) 



J{Pt)-J{Pt®Pt) > II ^llLi(H<fl(e)) 



- J \ S/lnpt — c*x — d*\'^ptdx 



The theorem now follows from this last inequality and (4.1). □ 
Proposition 4.4.

/d 
\—lnpt{x) + CX + d\'^g{x)dx}. 
ax 

Proof. Introduce the convolution operator $C_{p,\theta}$:

$$C_{p,\theta} f(x) = \int f(\langle e_1, R_\theta(x, y)\rangle)\, p(y)\,dy$$

where $\langle\cdot,\cdot\rangle$ is the scalar product in $\mathbb{R}^2$, $e_1 = (1, 0)$ and $R_\theta$ is rotation in $\mathbb{R}^2$ by $\theta$.

$$C_{p,\theta} : L^2(p) \to L^2(p).$$

For $p = g$, $g$ gaussian, and $\cos\theta = e^{-t}$, $C_{p,\theta}$ becomes the Ornstein-Uhlenbeck
process.

In this case $C_{p,\theta}$ is self-adjoint and its eigenvalues are $\cos^n\theta$.

In general $C_{p,\theta}$ is not bounded and is selfadjoint only for $p = g$.

Let $\Pi_j$ denote the projection on the subspace of the first $j$ eigenvectors of $C_{p,\theta}$,

$$\Pi_j + \bar\Pi_j = 1.$$

Now, consider

$$I_\theta = \iint |h(x) + h(y) - h(R_\theta(x, y))|^2 p(x) p(y)\,dx\,dy.$$

The following lemma is essentially due to Brown [Br]. See [CC2] for an adaptation
to the Boltzmann equation setting.

Lemma 4.5 (Linear Approximation Lemma).

(4.12) $$I_\theta \ge C_\theta \inf_{a,b} \int |h(x) - ax - b|^2 p(x)\,dx.$$

See [Br]. Here we use it with $\theta = \pi/4$.

Section 5. How to deal with thin tails.
Lemma 5.1. Let $p$ be a probability density with $I(p) < \infty$. Then for $q > 1$ and
$R > 0$,

$$\int_{\{|x| > R\}} p^q(x)\,dx \le I(p)^{(q-1)/2} \int_{\{|x| > R\}} p(x)\,dx.$$

Proof: Let $f := \sqrt{p}$. Using the bound $\|f\|_\infty^2 \le 2\|f\|_2 \|\nabla f\|_2$ for functions on $\mathbb{R}$,

$$\int_{\{|x| > R\}} p^q(x)\,dx = \int_{\{|x| > R\}} f^2 f^{2(q-1)}(x)\,dx \le \left(\int_{\{|x| > R\}} p(x)\,dx\right) \left(2\|\nabla f\|_2\right)^{q-1}.$$

Recall that $2\|\nabla f\|_2 = \sqrt{I(p)}$. □
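Lemma 5.1 can be checked directly on Gaussians, where the Fisher information is explicit. The sketch below assumes one-dimensional centered Gaussian densities of variance `var`, for which $I(p) = 1/\mathrm{var}$, and compares both sides of the tail inequality with a simple trapezoid rule. The test family, the function names and the quadrature parameters are illustrative assumptions.

```python
import math

def gauss(x, var):
    """Centered Gaussian density with variance var."""
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def tail_integral(f, R, width=30.0, steps=30000):
    """Trapezoid approximation of the integral of an even function f over
    |x| > R, truncated at |x| = R + width."""
    h = width / steps
    total = 0.0
    for i in range(steps + 1):
        x = R + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * f(x)
    return 2.0 * total * h

def lemma_sides(var, q, R):
    """Return (left side, right side) of the inequality of Lemma 5.1."""
    fisher = 1.0 / var  # Fisher information of the Gaussian with variance var
    lhs = tail_integral(lambda x: gauss(x, var) ** q, R)
    rhs = fisher ** ((q - 1.0) / 2.0) * tail_integral(lambda x: gauss(x, var), R)
    return lhs, rhs

for var in (0.3, 1.0, 2.0):
    for q in (1.5, 2.0, 3.0):
        for R in (0.0, 1.0, 2.0):
            lhs, rhs = lemma_sides(var, q, R)
            print(var, q, R, lhs <= rhs)
```

For Gaussians the inequality holds with a margin of $(2\pi)^{(q-1)/2}$, since the supremum of $g_{\mathrm{var}}$ is $(2\pi\,\mathrm{var})^{-1/2}$ while $\sqrt{I}$ equals $\mathrm{var}^{-1/2}$.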

Lemma 5.2. Let p be a probability density with I{p) < 00 and finite second mo- 
ment. Then 



[ p\ \np\dx < 2/(p)i/2 ( / p{x)dx] + ^\ [ p(l + \x\^)dx] 

J\x\>R \J{\x\>R} J ^ \J\x\>R J 



1/2 



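Proof: Fix any $r > 0$. On the set $\{p > 1\}$,

$$p|\ln p| = p\ln p \le \frac{1}{r}(p^{1+r} - p) \le \frac{1}{r} p^{1+r}.$$

By the previous lemma,

$$\int_{\{p > 1\} \cap \{|x| > R\}} p|\ln p| \le \frac{1}{r} I(p)^{r/2} \int_{\{|x| > R\}} p(x)\,dx.$$

On the set $\{p < 1\}$,

$$p|\ln p| = p\ln\frac{1}{p} \le \frac{1}{r}(p^{1-r} - p) \le \frac{1}{r} p^{1-r}.$$

Therefore, by Hölder,

$$\int_{\{p < 1\} \cap \{|x| > R\}} p|\ln p| \le \frac{1}{r} \int_{|x| > R} p^{1-r}\,dx \le \frac{1}{r} \left(\int_{|x| > R} p(1 + |x|^2)\,dx\right)^{1-r} \left(\int_{|x| > R} (1 + |x|^2)^{-(1-r)/r}\,dx\right)^{r}.$$

Choosing $r = 1/2$, we obtain the result. □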
Proposition 5.3. Let $p$ be a probability density with mean zero, unit variance, $I(p) < \infty$
and finite third moment. Let

$$p_n = p \circledast p \cdots \circledast p, \quad n \text{ times}.$$

Then there exists a constant $c$ such that for all $n$,

$$\int_{|x| < R} |p_n - g|\,dx \le cR\,2^{-n/2},$$

$$\int_{|x| < R} |p_n/g - 1|\,dx \le cR\,e^{R^2/2}\,2^{-n/2}.$$

Proof: See Feller or Major.

We are now ready to define the stitching operations.
Recall the definition

$$p_n := \sqrt{2}\, p_{n-1} * p_{n-1}(\sqrt{2}x),$$

with $\int x^2 p_n = \int x^2 p_0 = 1$. We further define $N := 2^n$. Then, we let, for some fixed
$c > 0$,

1 _xi 

where 

with a nonnegative mollifier function $h_0$, satisfying: $h_0 \ge 0$, $h_0 \in C_0^\infty$, support of
$h_0 \subset [-1, 1]$, $\int h_0 = 1$. Here $1_B$ denotes the characteristic function of the set $B$. We
then normalize:



such that 



/ 



X Pr 



XPn = 0. 

Writing c„ = 1 + e^, (in = 1 + = 1 + ^n? it follows, by an application of the 

local central limit theorem, and localization, that the $\epsilon_n$'s tend to zero, as $n$ goes
to infinity.
Proposition 5.4. Let $S$ denote the entropy functional, as before, and $p_n$, $\tilde p_n$ be
defined as above. Then,

S{pn)-S{pn) = r{n^/^)-\ 

r{k) tends to infinity as k goes to infinity. Moreover, if pi is polynomially localized 
to order 2m + 2, then r{k) grows like k^; for p\ exponentially localized, r{k) is 
exponentially growing in k. 

Proof: 

S{pn) - S{pn) = (pn In Pn - Pn In Pn) + 

J \x\<Cy/n 

where 

Rn= (pn In Pn - Pn In Pn). 

J \x\>c^/n 

If $p_1$ is polynomially localized, to order $2m$ (respectively, exponentially localized),
then by our previous results on propagation of localization, in these cases, the
localization persists, uniformly in $n$. Since the range of integration in the $R_n$ term
is $|x| > c\sqrt{n}$, the bound $R_n = r(n^{1/2})^{-1}$ follows.

It remains to control the other part of the integration region. In this region we 
have that: 



P 



n 



and therefore, 
/ (Pn In 

Pn Pn In Pn) 

J \ x\<Cy/n 



/ (pn In Pn - (1 + en) Vn) [In pn - ln(l + en)] 

J \x\<c^/n 



= I pnlnpn(l - zr^ — ) 

J\x\<c^/n i- -r 

en) Pn 

ln(l + 

J\x\< c\fn 



=\/n 

Since the entropy is uniformly bounded in $n$, and the $p_n$ are all normalized to 1,
the proof follows, if we show that

$$\epsilon_n = r(n^{1/2})^{-1}.$$

This last estimate follows directly from the definition of the stitched distribution: 

/Pn = Pn + Rn = Rn + Rn + 1- 

J\x\<c^ 

A similar estimate holds for the other $\epsilon$'s.
□ 
Proposition 5.5.

Let $N$ be defined as before, for any fixed $n$. Assume that $p$ satisfies the
normalization conditions as before, and furthermore it is Gaussian, exponentially or
polynomially (of order $p > 4$) localized:

$$\|e^{bx^2} p\|_\infty < 1, \quad b > 0,$$

$$\|e^{b|x|} p\|_\infty < 1, \quad b > 0,$$

$$\||x|^p p\|_1 < b, \quad b > 0.$$

Let $\tilde p$ be the associated stitched distribution as defined before. Then,

S{p2N)>S{pN*PN)-r{VN)-\ 

Proof: 

Using that 

S{p) = sup p(f)dx — In e^dx^ , 
and choosing — pn* Pni ^ arbitrary, we arrive at: 

S{p2n) > S{pn* pn) + j P2n ln(pn * p„) 

= / (P2n - pn* pn) ln(p„ * pn) 

J\x\<cs/nl1 

+ / (P 2n - Pn * Pn) ln(pn * Pn) 

J\x\>2c^l'l 

+ / (P2n - Pn * Pn) ln(p„ * p„) 

J c^fnl2<\x\<1c^/n 



\x\<cs/nl'2 
\x\>2cy/n/2 

r 

'cyn/2<|a;|<2cVn 

- ) 

\x\>2cy/n 

B 



= S{pn* pn) + - / (x^/2)(p2n - Pn * Pn) + B, 

J \x\>2c^/n 



■= ip2n- Pn* Pn)^T^{Pn* Pn)- 

J c^fnl2<\x\<2c\/n 



' c^/n/2<\x\<2c^/n 

We now use this last inequality with $n$ replaced by $N := 2^n$.
Then, we choose 2c < cq, so that for n > ATq, we have that pn > e~^^/^ for 
\x\< 2cVW. Hence 

B<CN I {p2N - PN * pn) < cie-^^ 

J2c>\x\>cVN/2 

since, by the pointwise CLT, for such $x$, we have gaussian localization.
Finally,

_{x^/2){p2N -PN* Pn) = r(ViV)-^ 



/ 

J\x 



\x\>2cVn 
□ 

Proof of the Main Theorem-I 

By the above proposition we have that: 

S{p2n) > S{pN * Pn)- 
r{VN)-^ 

>S{pn) + HS{pn\9))- 
r{VN)-^ > S{pn) + HSipN\g)) - r{VN)-^ 

The proof of the main theorem , namely that S{pn) — > S{g) + r{\/N)~^, follows 
from the following: 

Theorem 5.6. For $p$ Gaussian localized as above, and for all $n$ large enough, we
have:

$$c_1 g \le p_n \le c_2 g,$$

$$0 \le S(p_n * p_n | g) \le (1 - c) S(p_n | g).$$

$c$ depends on $c_1, c_2$, and $0 < c < 1$.

The proof of the above theorem follows from the construction of p and our 
previous estimates on entropy production in the Gaussian localized case. 
Completion of the Proof of the Main Theorem 

The proof now follows, since we can replace $\Phi(S(p_N|g))$ by $c\,S(p_N|g)$; $c$ is
strictly positive, uniformly in $N$, since $c_1, c_2$ can be chosen uniformly in $N$, for all
$N$ large enough. □

Then, the relative entropy satisfies, under favorable localization conditions 



(5.1) 



D{p2n)-D{pn) > doD{pN). 
From this, we immediately conclude that the relative entropy converges to zero,
exponentially fast in $N$.

This is the basis for the argument giving an optimal convergence rate in the
entropy sense, for localized initial distributions $p$.

The inequality (5.1) is the crucial inequality, proved in Section 3, using the
propagation of localization for gaussian localized $p$. The MAIN THEOREM
now follows:

Proof. Since p is gaussian (or exponentially or polynomially) localized and smooth, 
we see that p satisfies the conditions for Theorems 5.4,5.5,5.6. 
Hence, either (in the gaussian or exponential case) 



(5.3a) J pnB^^^^dx < c < oo, independently of 



or. 

Next, we apply Theorems 4.3,5.4-5.6 to pN to conclude that 
(5.4) D{p2n) - D{pn) > CsD{pn) - r(v^)-\ 

with 

(5-5) = ^ll~llLi(fl(£))- 

Due to the propagation of localization (5.3), we see that ||^||L°°(i?(e)) < oo, 
uniformly in N and hence Ce > S > uniformly in N, which implies that 



$$|D(p_n)| \le r(\sqrt{N})^{-1} + O(1/N).$$



□ 